Border Noise Removal of Camera-Captured Document Images Using Page Frame Detection

Authors

  • Syed Saqib Bukhari
  • Faisal Shafait
  • Thomas M. Breuel
Abstract

Camera-captured document images usually contain two main types of marginal noise: textual noise (coming from neighboring pages) and non-textual noise (resulting from the page surroundings and/or the binarization process). These types of marginal noise degrade the performance of preprocessing (dewarping) of camera-captured document images and of subsequent document digitization and recognition processes. Page frame detection, a recently investigated area in document image processing, is used to remove border noise and to identify the actual content area of document images. In this paper, we present a new technique for page frame detection of camera-captured document images. We use text and non-text content information to find the page frame of document images. We evaluate our algorithm on the DFKI-I (CBDAR 2007 Dewarping Contest) dataset. Experimental results show the effectiveness of our method in comparison to other state-of-the-art page frame detection approaches.

Keywords: Border Noise Removal, Page Frame Detection, Camera-Captured Document Images
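
As a minimal illustration of how a page frame can be used to remove border noise, the sketch below clears every foreground pixel that lies outside a given frame. It is not the authors' detection method: the frame is assumed to be already known, and the function name remove_border_noise, the nested-list binary image, and the (top, left, bottom, right) frame convention are hypothetical choices made for this example.

```python
from typing import List, Tuple

# Assumed representation: 1 = foreground (ink or noise), 0 = background.
Image = List[List[int]]
# Assumed frame convention: rows top..bottom-1 and columns left..right-1 are kept.
Frame = Tuple[int, int, int, int]


def remove_border_noise(image: Image, frame: Frame) -> Image:
    """Return a copy of `image` with all pixels outside `frame` set to background."""
    top, left, bottom, right = frame
    return [
        [
            pixel if (top <= r < bottom and left <= c < right) else 0
            for c, pixel in enumerate(row)
        ]
        for r, row in enumerate(image)
    ]


# Tiny example: a 4x4 image whose outer ring is marginal noise.
noisy = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
]
cleaned = remove_border_noise(noisy, (1, 1, 3, 3))
```

In this toy case only the 2x2 interior survives. A real pipeline would first have to estimate the frame from the text and non-text content, which this sketch does not attempt.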

Similar resources

Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)

Document images produced by a scanner or digital camera usually suffer from geometric and photometric distortions. Both deteriorate the performance of OCR systems. In this paper, we present a novel method to compensate for undesirable geometric distortions, aiming to improve OCR results. Our methodology is based on finding text lines by dynamic local connectivity map and then applying a l...

Automatic Borders Detection of Camera Document Images

When capturing a document using a digital camera, the resulting document image is often framed by a noisy black border or includes noisy text regions from neighbouring pages. In this paper, we present a novel technique for enhancing the document images captured by a digital camera by automatically detecting the document borders and cutting out noisy black borders as well as noisy text regions a...

Page Frame Detection for Marginal Noise Removal from Scanned Documents

We describe and evaluate a method to robustly detect the page frame in document images, locating the actual page contents area and removing textual and non-textual noise along the page borders. We use a geometric matching algorithm to find the optimal page frame, which has the advantages of not assuming the existence of whitespace between noisy borders and actual page contents, and of giving a ...

Camera-Based Document Image Mosaicing

In this paper we present an image mosaicing method for camera-captured document images. Our method is unique in not restricting the camera position, thus allowing greater flexibility than scanner-based or fixed-camera-based approaches. To accommodate the perspective distortions introduced by varying poses, we implement a two-step image registration process that relies on accurately computin...

Publication date: 2011